fix(audio): Normalize 'x-wav' audio format to 'wav' #9017

akshatvishu · 2025-11-04T13:17:37Z

Description:

The dspy.Audio.from_file and from_url methods rely on Python's mimetypes.guess_type() to determine the audio format. On some operating systems, this function returns non-standard MIME types, such as audio/x-wav for .wav files.

These non-standard format strings, often prefixed with x- (like x-wav or x-m4a), are then passed to the LLM API (e.g., OpenAI). This can cause a 400 BadRequestError, as the API typically only accepts compliant formats (e.g., wav, m4a).

This patch adds a check to from_file, from_url, and the data URI branch of encode_audio to normalize these formats by removing any x- prefix, ensuring an API-compliant format is always sent.
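A minimal sketch of the normalization described above, for illustration only. The helper name `format_from_mime` is hypothetical and is not taken from the actual diff; it only assumes that the format is the MIME subtype with any leading `x-` removed.

```python
import mimetypes


def format_from_mime(mime_type: str) -> str:
    """Return the audio format for a MIME type, dropping a leading 'x-'.

    Hypothetical helper for illustration; not the code from this PR.
    """
    if not mime_type.startswith("audio/"):
        raise ValueError(f"Not an audio MIME type: {mime_type!r}")
    subtype = mime_type.split("/", 1)[1]
    # str.removeprefix (Python 3.9+) strips only a leading "x-".
    return subtype.removeprefix("x-")


def guess_audio_format(path: str) -> str:
    """Guess the audio format of a file path via mimetypes.guess_type()."""
    mime_type, _ = mimetypes.guess_type(path)
    if mime_type is None:
        raise ValueError(f"Could not determine MIME type for {path!r}")
    return format_from_mime(mime_type)
```

With this sketch, `format_from_mime("audio/x-wav")` and `format_from_mime("audio/wav")` both yield `wav`, so the value sent to the API no longer depends on which MIME type the platform reports.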

dspy/adapters/types/audio.py

TomeHirata · 2025-11-06T04:14:53Z

Thanks @akshatvishu, can you add a unit test?

akshatvishu · 2025-11-06T15:44:44Z

@TomeHirata Added unit tests. I also changed the logic slightly to use removeprefix() instead of replace(), so that only a leading "x-" is stripped from the audio format string and any "x-" appearing elsewhere in the format is left untouched.
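To illustrate the difference, here is a small comparison under stated assumptions: the format string `x-flac-x-test` is made up for the example, and both function names are hypothetical rather than taken from the PR.

```python
def normalize_with_replace(audio_format: str) -> str:
    # Removes every occurrence of "x-", not just a leading one.
    return audio_format.replace("x-", "")


def normalize_with_removeprefix(audio_format: str) -> str:
    # Removes "x-" only when it is at the start of the string (Python 3.9+).
    return audio_format.removeprefix("x-")


made_up_format = "x-flac-x-test"  # hypothetical value, for illustration only
print(normalize_with_replace(made_up_format))       # flac-test
print(normalize_with_removeprefix(made_up_format))  # flac-x-test
```

For common inputs such as `x-wav` the two approaches agree; they differ only when `x-` also occurs later in the string.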

TomeHirata

LGTM

fix(audio): Normalize 'x-wav' audio format to 'wav'

91e7bad

TomeHirata reviewed Nov 5, 2025

dspy/adapters/types/audio.py (outdated, resolved)

fix(audio): Normalize all 'x-' prefixed audio formats to their standard equivalents

28eebf6

TomeHirata reviewed Nov 6, 2025

dspy/adapters/types/audio.py (outdated, resolved)

akshatvishu added 6 commits November 6, 2025 19:34

refactor(audio): use removeprefix for safer audio format normalization

9a104be

refactor(audio): centralize audio format normalization logic

ea712f2

test: add unit tests for audio format normalization

f17ac7f

test/audio: update audio format normalization tests with comprehensive cases

2588c6c

style: clean up test file comments

eeb5bfa

style: fix spelling error at audio.py

9a86f4a

TomeHirata approved these changes Nov 10, 2025

TomeHirata merged commit b3c6350 into stanfordnlp:main Nov 10, 2025
10 checks passed
